Hardware-based pre-page walk virtual address transformation

ABSTRACT

An indication of a virtual address is received. A current page size of a plurality of available page sizes is read from a register. A shift amount is determined based, at least in part, on the current page size. A bit shift of the virtual address is performed in which the virtual address is bit shifted by, at least, the determined shift amount.

RELATED APPLICATIONS

This application claims the benefit of U.S. patent application Ser. No. 13/834,739, filed Mar. 15, 2013.

BACKGROUND

Embodiments of the inventive subject matter generally relate to the field of computing systems, and, more particularly, to virtual memory.

Virtual memory allows computing systems to better manage memory than if the computing system was limited to managing the actual memory accessible by the computing system. For example, virtual memory functionality allows the computing system to allocate non-contiguous regions to a particular application, while giving the application the appearance of being allocated one contiguous region of memory. Additionally, virtual memory functionality allows a computing system to allocate more memory than is actually available in the computing system to applications. Whenever an application references a memory address on a system that implements virtual memory, the address is translated into a physical address that refers to the actual memory location. Because applications frequently interact with memory, inefficiency and inflexibility within the virtual memory system can cause performance degradation.

SUMMARY

Embodiments of the inventive subject matter generally include a method for virtual address transformation. The method includes receiving an indication of a virtual address. The method also includes reading, from a register, a current page size of a plurality of available page sizes. The method also includes determining a shift amount based, at least in part, on the current page size. The method also includes performing a bit shift of the virtual address to create a transformed virtual address in which the virtual address is bit shifted by at least the determined shift amount.

BRIEF DESCRIPTION OF THE DRAWINGS

The present embodiments may be better understood, and numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.

FIG. 1 depicts the components and operations of a processor that supports hardware-based virtual address transformation and dynamic page sizing.

FIG. 2 depicts a flowchart of example operations for implementing a virtual address transformation unit.

FIG. 3 depicts the components of an embodiment of a shift register-based transformation unit.

FIG. 4 depicts a flowchart of example operations for implementing a transformation unit using a barrel shifter.

FIG. 5 depicts an example computer system including a virtual address transformation unit.

DESCRIPTION OF EMBODIMENT(S)

The description that follows includes example systems, methods, techniques, instruction sequences and computer program products that embody techniques of the present inventive subject matter. However, it is understood that the described embodiments may be practiced without these specific details. For instance, although examples refer to 64-bit memory architectures, the inventive subject matter can be implemented on memory architectures utilizing any size memory addresses. In other instances, well-known instruction instances, protocols, structures and techniques have not been shown in detail in order not to obfuscate the description.

Computing systems are generally implemented such that the smallest unit of memory that can be allocated to an application is a “page.” The size of a page can vary between implementations. For example, while a four kilobyte (4 KB) page is common, a page can be defined as any size. Additionally, some implementations allow selectable page sizes, such as allowing a 4 KB “small” page and a 64 KB “large” page. Although not allowing smaller sized regions of memory to be allocated can result in memory fragmentation, the design of a memory architecture utilizing pages can result in increased efficiency that offsets the memory fragmentation.

Virtual memory allows computing systems to more efficiently allocate memory. Each application or process (hereinafter “application”) is associated with a virtual address space. The virtual address space appears, to the application, as a contiguous set of memory. A page table maps the pages of the virtual address space to pages in the physical memory, which may not be contiguous. Virtual memory also allows the computing system to provide more memory to the application than is actually available in the system. For example, in a system that has 4 GB of random access memory (RAM) installed, multiple applications can be allocated 4 GB or more of virtual memory. This is possible because most applications do not actively use all available memory at once, allowing the computing system to “page out” inactive regions to other storage, such as hard disks.

In some embodiments, a computing system can include multiple page tables, such as one for each application. The computing system can include configuration data that indicates where the page table for a particular application is located in memory, allowing the computing system to reference the proper page table depending on which application is currently being executed. At the most basic, a page table contains page table entries that map virtual memory pages to physical memory pages. Page tables can also include metadata about the individual pages, such as what process the page belongs to, whether the page is designated as read-only, etc. Page tables can also be implemented in a variety of ways, such as inverted page tables, hashed page tables, radix page tables, etc.

When an instruction associated with an application references a virtual memory address (hereinafter “virtual address”), the computing system translates the virtual address to a physical memory address (hereinafter “physical address”). The techniques used to translate the virtual address to a physical address vary between implementations. For example, in the basic page table described above, the computing system searches for the virtual address in the page table (referred to as a “page table walk”). Upon finding the virtual address, the computing system reads the physical address associated with the page table entry. Other page table implementations can be more complex. For example, a hashed page table can include using a hash function to generate a hash of the virtual address, an index lookup using the hashed value, and a page table walk of an inverted page table using the index value.

Virtual addresses can be broken up into multiple components. For example, a virtual address can be defined to include a page identifier and a page offset. The page offset specifies the specific byte within the page that is being accessed, whereas the page identifier specifies which page the byte is located in. Thus, when doing a page table walk, the page identifier portion of the virtual address can be used to find the page table entry. The page offset would then be concatenated with the physical address in the page table entry to get the complete physical address.
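The split and recombination described above can be sketched in software as follows. This is only an illustrative model, not the hardware described herein; the function names, the 4 KB page size, and the sample values are assumptions made for the example.

```python
# Illustrative sketch only: names and the 4 KB page size are assumptions.
PAGE_OFFSET_BITS = 12  # a 4 KB page has 2**12 addressable bytes


def split_virtual_address(virtual_address, offset_bits=PAGE_OFFSET_BITS):
    """Return (page_identifier, page_offset) for a virtual address."""
    page_offset = virtual_address & ((1 << offset_bits) - 1)  # low-order bits
    page_identifier = virtual_address >> offset_bits          # remaining bits
    return page_identifier, page_offset


def combine_physical_address(physical_page, page_offset,
                             offset_bits=PAGE_OFFSET_BITS):
    """Concatenate the physical page number from a page table entry
    with the page offset to form the complete physical address."""
    return (physical_page << offset_bits) | page_offset


page_id, offset = split_virtual_address(0x12345678)
# page_id is 0x12345 and offset is 0x678 for this sample address.
```

If a hypothetical page table entry maps page identifier 0x12345 to physical page 0xABC, combine_physical_address(0xABC, offset) yields 0xABC678.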

Due to the potential size of a virtual address space, a page table can get very large. Page tables can be broken up into more manageable sizes by using a multilevel page table. For example, if the memory architecture is designed to work most efficiently with 4 KB page sizes, the page table can be designed to have multiple, hierarchical levels of 4 KBs. Each page table entry is a reference to another page table in the hierarchy, until the last level, which references the physical address. In such implementations, the page identifier portion of a virtual address can be further defined to include sub-identifiers that correspond to particular levels of the page table.

Page tables can be implemented to support multiple page sizes. This is facilitated by having the page table managed by the operating system. Because the operating system is software, more flexibility is afforded to it than a hardware implementation. For example, as the page size increases, the page offset portion of the virtual address increases in size while the page identifier portion becomes smaller. Software is better adapted to handle such variations. Additionally, in a multilevel page table, fewer levels are used when the page size increases. Thus, the transformation of the virtual address into a form used by the page walk mechanism can change depending on the page size. However, each operation performed by the operating system increases the amount of time used to translate the virtual address to a physical address. Performing one or more of the operations in hardware can increase the efficiency of the memory architecture.

Thus, the memory architecture of a computing system can be designed to transform a virtual address based on the configured page size prior to the page walk using hardware. This reduces the number of operations performed in software, thus increasing the efficiency of the translation process. Further, if the page table walk mechanism functions independently of the page size, the complexity of the software is reduced while increasing compatibility.

FIG. 1 depicts the components and operations of a processor that supports hardware-based virtual address transformation and dynamic page sizing. FIG. 1 depicts a subset of a processor 100, including a virtual memory configuration register 102 and a virtual address transformation unit 106. The virtual address transformation unit 106 includes a shift register 108. FIG. 1 also depicts a virtual address 104, a page identifier 110 and a page offset 112.

It should be noted that the descriptions herein exclude some low level details that have only minor impacts on implementations. For example, some examples may refer to 64-bit memory addresses, but this is just for illustration and the inventive subject matter can apply to any size memory address. It is also assumed that the memory architecture is byte-addressable, but the inventive subject matter can apply even in implementations that are not byte-addressable.

At stage A, a virtual address 104 is referenced and sent to the memory architecture for transformation and translation. The memory architecture generally exists as a component on the processor, but can also exist as a separate component or combined with other components. The virtual address 104 is received by the virtual address transformation unit (hereinafter “transformation unit”) 106. A virtual address 104 can be referenced by various components and processes within a processor. For example, a load instruction can reference a virtual address 104. If the data located at the virtual address 104 is not also located in the cache, the processor fetches the data from memory. To fetch the data from memory, the processor sends the virtual address 104 to the memory architecture, which translates the virtual address 104 into a physical address. A processor can also perform context switches, in which one process or thread running on the processor is exchanged for another process or thread. During a context switch, the processor may store the state of the process or thread that is being idled to memory, and thus may reference the virtual addresses associated with the process or thread data.

The size of the virtual address 104 can vary between architectures. For example, a 32-bit architecture would generally have a maximum virtual address size of 32 bits, while a 64-bit architecture would generally have a maximum virtual address size of 64 bits. Architectures can also further restrict the size of the virtual address 104 for purposes such as including bits representing metadata. Various ranges of bits in the virtual address 104 can serve different functions as well. As described above, the virtual address 104 can include a set of bits representing the page identifier and a set of bits representing the page offset. In some embodiments, as depicted here, the virtual address 104 is defined as having a length of i plus o bits, where i is the number of bits representing a page identifier and o is the number of bits representing the page offset. In the embodiment depicted in FIG. 1, o low order bits of the virtual address 104 represent the page offset 112, while the remaining i bits represent the page identifier 110. The specific number of bits in the virtual address 104 that represent the page offset 112 and page identifier 110 will vary based on the current page size, as described below.

At stage B, the transformation unit 106 reads configuration data from the virtual memory configuration register (hereinafter “configuration register”) 102. The configuration data can include various settings such as the memory address for the first level of the page table and the page size. The transformation unit 106 determines the page size based on the data stored in the configuration register 102. The technique used to determine the page size can vary between implementations. For example, the transformation unit 106 may utilize bit masking and bit shifting to isolate the specific bits representing the page size, or may read only the specific bits that represent the page size.
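As a rough illustration of the masking and shifting approach, the sketch below isolates a page size field from a combined configuration value. The register layout used here (a three-bit page size field at bits 4 through 6, with the page table address in the bits from 12 upward) is purely hypothetical and is not specified by the description.

```python
# Hypothetical layout, assumed only for this example:
#   bits 4..6  -> page size field
#   bits 12..  -> address of the first level of the page table
PAGE_SIZE_FIELD_SHIFT = 4
PAGE_SIZE_FIELD_WIDTH = 3
PAGE_TABLE_BASE_MASK = ~0xFFF


def read_page_size_field(config_register):
    """Shift the field down to bit 0, then mask off the other settings."""
    field_mask = (1 << PAGE_SIZE_FIELD_WIDTH) - 1
    return (config_register >> PAGE_SIZE_FIELD_SHIFT) & field_mask


def read_page_table_base(config_register):
    """Mask off the low-order settings to leave the page table address."""
    return config_register & PAGE_TABLE_BASE_MASK


# Sample value: page table at 0x7F000, page size field set to 0b010.
sample_config = 0x7F000 | (0b010 << PAGE_SIZE_FIELD_SHIFT)
```

With this sample value, read_page_size_field(sample_config) returns 0b010 and read_page_table_base(sample_config) returns 0x7F000.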

The page size can be stored in the configuration register 102 in various ways. For example, the page size can be stored as the number of bits to shift the virtual address 104 (described in more detail in relation to stage C). The page size can also be stored as a value that is associated with a specific page size. For example, 0b000 may represent 4 KB page sizes, 0b001 may represent 8 KB page sizes, 0b010 may represent 16 KB page sizes, etc.

The configuration register 102 includes at least one write input that allows one or more components to write a value to the configuration register 102. For example, the operating system can be allowed to change the values in the configuration register 102 by writing to the configuration register 102. The operating system can also be restricted to writing only certain parts of the configuration register 102. The processor may write to the configuration register 102 when switching processes or threads in order to change the address of the first level of the page table.

The configuration register 102 is one representation of many possible implementations for the maintaining of the configuration data. Some other variations include storing each configuration setting in individual registers instead of combining them into one, storing the configuration settings in other storage mechanisms (such as latches), storing the configuration settings in system memory, etc. Thus the inventive subject matter is not limited to embodiments with a configuration register as depicted above.

At stage C, the transformation unit 106 transforms the virtual address 104 based on the configured page size. The technique(s) used to transform the virtual address 104 can vary between implementations based on many factors, such as the design and implementation of the memory architecture and the desired layout of the memory address. For example, some bits available for the virtual address 104 can instead be used for metadata. Metadata bits may be removed using a variety of techniques, such as bit shifting or bit masking, during the transformation of the virtual address 104.

As described above at stage A, o low order bits of the virtual address 104 represent the page offset 112 and the remaining i bits represent the page identifier 110. The page identifier 110 is used to identify the memory page in which the data resides, while the page offset 112 refers to the specific byte within the memory page that contains the data. However, if the page size changes, the number of bits representing the page identifier 110 and the number of bits representing the page offset 112 change. For example, 4 KB pages include 4096 individual bytes, meaning the page offset 112 is represented by twelve bits. In other words, if the page size is 4 KB, o is twelve. The remaining bits of a 64-bit memory address, or fifty-two bits, represent the page identifier 110. However, if the page size is 64 KB, each page includes 65536 bytes. To represent 65536 bytes, the page offset 112, and o, is sixteen bits.
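The arithmetic above amounts to taking the base-two logarithm of the page size. A minimal sketch, assuming power-of-two page sizes and a 64-bit address with no metadata bits (the function names are illustrative only):

```python
def page_offset_bits(page_size_bytes):
    """Number of bits o needed to address every byte in a page."""
    if page_size_bytes <= 0 or page_size_bytes & (page_size_bytes - 1):
        raise ValueError("page size must be a power of two")
    return page_size_bytes.bit_length() - 1  # log2 of a power of two


def page_identifier_bits(page_size_bytes, address_bits=64):
    """Number of bits i left over for the page identifier."""
    return address_bits - page_offset_bits(page_size_bytes)
```

For a 4 KB page this gives o = 12 and i = 52; for a 64 KB page it gives o = 16 and i = 48.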

In order to provide the system component that walks the page table with the appropriate page identifier 110 and page offset 112, the transformation unit 106 is implemented to isolate the page identifier 110 from the page offset 112 (as well as any other bits, such as metadata bits) based on the page size. With virtual addresses defined as in the depicted embodiment, a shift register 108 can be used to transform the virtual address 104. The shift register 108 is implemented such that it shifts the virtual address 104 by an amount specified by the page size configuration data stored in the configuration register 102. As detailed in reference to stage B, the page size configuration data can be stored in different ways. For example, as detailed above, the shift register 108 can be implemented to shift the virtual address 104 by one bit for each bit in the page size configuration data that is set to one. Thus, if the page offset 112 is twelve bits, the page size configuration data would be set to a value that includes twelve bits set to one.

As also detailed above, the page size configuration data can also be stored as individual values that are associated with specific page sizes. For example, 0b000 may represent 4 KB page sizes, 0b001 may represent 8 KB page sizes, 0b010 may represent 16 KB page sizes, etc. The actual implementation can vary. In implementations where the page sizes increase by powers of two, as just described, the shift register can be implemented to shift the virtual address 104 by a set base amount plus an extra amount corresponding to the page size configuration data. For example, if 0b000 represents 4 KB page sizes, the shift register 108 can be implemented to always shift the virtual address 104 by a minimum of twelve bits. The shift register 108 then additionally shifts by the number of bits represented by the page size configuration data. If the page size configuration data is set to 0b000, which represents the value zero, no additional shifting is done, resulting in a shift of twelve bits. If the page size configuration data is set to 0b001, which represents the value one, the virtual address is shifted by one additional bit, resulting in a shift of thirteen bits. In implementations where the page size configuration data corresponds to arbitrary values, a mapping table or other similar mechanism can be implemented to translate the value stored in the page size configuration data to the appropriate number of bits to shift.
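Both ways of deriving the shift amount can be sketched briefly. The base amount of twelve and the first three encodings follow the example in the text; the contents of the mapping table are hypothetical values chosen only to show an encoding with no arithmetic relationship to the shift amount.

```python
BASE_SHIFT = 12  # shift for the smallest (4 KB) page, per the example above


def shift_from_encoded_value(page_size_code, base_shift=BASE_SHIFT):
    """Successive power-of-two page sizes: base amount plus the encoded value."""
    return base_shift + page_size_code


# Hypothetical mapping for arbitrary encodings: 4 KB, 64 KB and 16 MB pages.
ARBITRARY_SHIFT_TABLE = {0b000: 12, 0b001: 16, 0b010: 24}


def shift_from_mapping_table(page_size_code, table=ARBITRARY_SHIFT_TABLE):
    """Arbitrary encodings: look the shift amount up instead of computing it."""
    return table[page_size_code]
```

Here shift_from_encoded_value(0b001) gives thirteen, matching the 8 KB case described above, while shift_from_mapping_table(0b001) gives sixteen under the assumed table.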

The bits that are shifted out of the virtual address 104 represent the page offset 112. The shift register 108 can be implemented in various ways. For example, the shift register can include a barrel shifter and a register designated to store the page offset. The shift register 108 can also include a shifter with a parallel input and a serial output, the serial output being connected to a register with a serial input. As bits are shifted out of the virtual address 104, the bits are transmitted to the register via the serial input. The page offset can then be read from the register in the shift register 108 implementation. Because the page offset is variable size, the transformation unit 106 can pad the page offset portion of the output with zeros as appropriate. Furthermore, the shift register 108 can be implemented to either zero fill the most significant bits as the least significant bits are shifted off or implemented to maintain a sign bit. For example, if the virtual address 104 is a signed value, the shift register 108 can maintain the proper sign by propagating copies of the sign bit.
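The difference between zero filling and sign propagation can be modeled in software as below. This is a behavioral sketch of a 64-bit shift, not a description of the shift register 108 circuitry; the function name and the preserve_sign flag are assumptions for the example.

```python
ADDRESS_BITS = 64
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1


def shift_out_offset(virtual_address, shift, preserve_sign=False):
    """Shift right by `shift` bits and return (shifted_value, page_offset).

    The bits shifted out form the page offset. The vacated high-order bits
    are zero filled unless preserve_sign is set and the top bit is one, in
    which case copies of the sign bit are propagated into them.
    """
    virtual_address &= ADDRESS_MASK
    page_offset = virtual_address & ((1 << shift) - 1)
    shifted = virtual_address >> shift  # zero fill by default
    sign_bit = (virtual_address >> (ADDRESS_BITS - 1)) & 1
    if preserve_sign and sign_bit:
        fill = ((1 << shift) - 1) << (ADDRESS_BITS - shift)
        shifted |= fill  # propagate the sign bit into the vacated positions
    return shifted, page_offset
```

For the sample address 0xFFFF800000001ABC and a twelve bit shift, zero filling yields 0x000FFFF800000001 while sign propagation yields 0xFFFFFFF800000001; the page offset is 0xABC in both cases.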

The transformation of the virtual address 104 is independent of the particular page table type. Thus, the transformation unit 106 transforms the virtual address 104 into a format that is compatible with the various page table types that are supported by the computing system. In other words, the page table type, and thus the page walk technique, is not dependent on the current page size, allowing the computing system to select any of a plurality of page sizes and any of a plurality of page table types, irrespective of each other. This can allow for all permutations of page size and page table types to be selected, depending on the specific implementation. The particular page table walk technique can be viewed as accepting a virtual address (or page identifier and/or page offset) as an input. As long as the provided input fits a particular defined template, the page table walker can utilize the virtual address, regardless of the particular page size and technique used to transform the original virtual address.

After the virtual address 104 is transformed, the page identifier 110 and the page offset 112 have been isolated from each other. The page identifier 110 consists of i bits, while the page offset 112 consists of o bits. In some implementations, the page identifier 110 and the page offset 112 can be recombined into a single register, allowing the page table interface to read the values from a single location.

At stage D, the page table interface reads the page identifier 110 and the page offset 112. The transformation of the virtual address 104 into a page identifier 110 and page offset 112 is transparent to the page table interface. As described above, various bits in the page identifier 110 can be used to reference different levels of the page table. For example, assume a computer architecture defines 4 KB pages as the smallest page size. Further assume that the virtual address 104 is 64 bits, but sixteen bits are not used for the actual address. If the page size was set to 4 KB, the resulting page offset 112 would contain twelve bits, while the page identifier 110 would contain thirty-six bits. The page identifier 110 could further be defined as containing four sets of nine bits, with each set corresponding to a level in the page table.
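The division of a thirty-six bit page identifier into four nine-bit sets can be illustrated as follows. The ordering, in which the most significant set selects the first level of the page table, is an assumption; the description does not fix which set corresponds to which level.

```python
def level_indices(page_identifier, levels=4, bits_per_level=9):
    """Split a page identifier into per-level indices.

    Assumes the most significant group indexes the first (top) level.
    """
    mask = (1 << bits_per_level) - 1
    indices = []
    for level in range(levels):
        shift = bits_per_level * (levels - 1 - level)
        indices.append((page_identifier >> shift) & mask)
    return indices


# Sample identifier built from the per-level indices 1, 2, 3 and 4.
sample_identifier = (1 << 27) | (2 << 18) | (3 << 9) | 4
```

Calling level_indices(sample_identifier) recovers the list [1, 2, 3, 4], one index per page table level.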

The page table interface is any component that consumes the page identifier and page offset, or any combination of components that consumes the page identifier and/or the page offset. The specific component can vary between implementations. For example, in some embodiments, the page table interface can be a component that performs additional transformation and/or translation to the page identifier and/or page offset. In some embodiments, the page table interface is the component that performs the page table walk. Additionally, the page table interface can be software, hardware or a combination thereof.

FIG. 2 depicts a flowchart of example operations for implementing a virtual address transformation unit.

At block 200, the transformation unit receives the virtual address. The actual implementation of how the transformation unit receives the virtual address can vary. For example, another component may write the virtual address to an input of the transformation unit. Or, another component may write the virtual address to a particular register, while also notifying the transformation unit that the virtual address was written to the register. The transformation unit can then read the virtual address from the register. After the transformation unit receives the virtual address, control then flows to block 202.

At block 202, the transformation unit reads the configuration information to determine the current page size. As described above, the page size can be stored in a variety of forms, including as part of several different data elements stored in a single register or stored in a register dedicated solely to the page size configuration data. Thus, the technique used to read the configuration information can vary between implementations, and can include reading a value from a register, reading a value from a register then masking and/or bit shifting the value, performing a partial read of a register, interacting with another component, such as the system memory or operating system, etc. After reading the configuration information and determining the page size, control then flows to block 204.

At block 204, the transformation unit transforms the virtual address based on the current page size. The transformation performed by the transformation unit can vary between implementations. As described above, the transformation unit can be implemented to shift the bits of the virtual address to obtain a page offset and page identifier. The transformation unit can also be implemented to pad one or both of the page offset and page identifier values, reverse the values, mask out certain bits and transform the virtual address into additional components beyond the page offset and page identifier. After transforming the virtual address, control then flows to block 206.

At block 206, the transformation unit provides the page offset and page identifier to the page table interface. Similar to receiving the virtual address, the techniques used to provide the page offset and page identifier to the page table interface can vary. For example, the page offset and page identifier can be written to inputs of the page table interface. The page offset and page identifier can be stored in one or more registers in the transformation unit, which the page table interface can read from after being notified by the transformation unit that the values are ready to be read.

FIG. 3 depicts the components of an embodiment of a shift register-based transformation unit. FIG. 3 depicts a transformation unit 300, including a page size specific shift amount register 304 and a base shift amount register 306. The transformation unit 300 also includes a barrel shifter 308, a register 310 and an adder 312. The barrel shifter 308 and register 310 can constitute the shift register 108 depicted in FIG. 1.

The transformation unit 300 is depicted above as reading the configuration data from the configuration register. However, in some embodiments, the transformation unit 300 does not explicitly read the data from the configuration register. Instead, any time the data changes in the configuration register, the updated configuration data is written to the page size specific shift amount register 304. Some embodiments utilize a base shift amount. In such embodiments, the base shift amount can be stored in a register, such as the base shift amount register 306. This allows additional flexibility when the page size increments are not powers of two. For example, as described above, if in a particular implementation the minimum page size is 4 KB and each successive page size is double the previous page size, the base shift amount register can be set to be twelve. Then, the value representing each page size is added to the base shift amount to determine the number of bits to shift the virtual address. If the minimum page size is 4 KB, which is represented by 0b000, and the next page size is 16 KB, represented by 0b001, the previous technique will not work. However, if the page size is configured to be 16 KB, the value thirteen can be written to the base shift amount register 306, thus accomplishing the appropriate fourteen bit shift.
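The interaction between the two registers and the adder can be modeled with a small behavioral sketch. The class and attribute names are hypothetical stand-ins for the page size specific shift amount register 304, the base shift amount register 306 and the adder 312; the sketch is not a hardware description.

```python
class TransformationUnitModel:
    """Behavioral model only; names are illustrative assumptions."""

    def __init__(self, base_shift_amount=12, page_size_shift_amount=0):
        self.base_shift_amount = base_shift_amount            # models register 306
        self.page_size_shift_amount = page_size_shift_amount  # models register 304

    def total_shift(self):
        """Models the adder: sum of the two register values."""
        return self.base_shift_amount + self.page_size_shift_amount

    def transform(self, virtual_address):
        """Return (page_identifier, page_offset) using the total shift."""
        shift = self.total_shift()
        return virtual_address >> shift, virtual_address & ((1 << shift) - 1)


# 4 KB pages, encoded as 0b000 with a base of twelve.
unit_4k = TransformationUnitModel(base_shift_amount=12, page_size_shift_amount=0b000)
# 16 KB pages encoded as 0b001: the base is rewritten to thirteen, as in the text.
unit_16k = TransformationUnitModel(base_shift_amount=13, page_size_shift_amount=0b001)
```

With these settings unit_4k.total_shift() is twelve and unit_16k.total_shift() is fourteen, so unit_16k.transform(0x12345678) returns the pair (0x48D1, 0x1678).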

The barrel shifter 308 includes multiple levels of multiplexors. The outputs of each level are connected to the inputs of the next level in a manner that allows for each level to either shift the output from the previous level by an additional power of two or pass the output straight through to the next level without shifting. Thus, if the barrel shifter 308 has five levels, the maximum shift amount is thirty-one bits, while six levels allows a maximum shift amount of sixty-three bits. The specific number of barrel shifter 308 levels will vary between implementations based on the maximum shift value. For example, if the maximum page offset is less than thirty-two bits, a five level barrel shifter 308 is sufficient. The output from the adder 312 activates the various levels of the barrel shifter by selecting the shift or pass-through multiplexor input, with the low order bit activating the first level, the next highest order bit activating the second level, etc. The barrel shifter 308 allows for constant-time bit shifts, thus any value written to the barrel shifter 308 inputs is shifted by the current shift amount and output to the register 310.
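The level-by-level behavior can be simulated in software to show why n levels cover shift amounts up to 2^n - 1. This is a sequential model of what the multiplexor levels compute, assuming a right shift with zero fill and a 64-bit input; the function names are illustrative.

```python
def barrel_shift_right(value, shift_amount, levels=6, width=64):
    """Model a right barrel shifter with the given number of levels.

    Level k either shifts by 2**k bits or passes its input through,
    selected by bit k of shift_amount (low order bit selects level 0).
    """
    if not 0 <= shift_amount < (1 << levels):
        raise ValueError("shift amount exceeds what the levels can express")
    value &= (1 << width) - 1
    for level in range(levels):
        if (shift_amount >> level) & 1:
            value >>= 1 << level  # this level shifts by a power of two
        # otherwise this level passes the value straight through
    return value


def max_shift(levels):
    """Largest shift expressible when every level shifts."""
    return (1 << levels) - 1
```

For example, a shift amount of twelve (0b001100) activates only the levels that shift by four and by eight bits, so barrel_shift_right(0x12345678, 12) returns 0x12345.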

The shifted value is written into the register 310. The transformation unit 300 can pad the values as appropriate either as part of the shifting process or output of the values from the register 310. The register 310 can include two outputs as depicted, with one output corresponding to the bits associated with the page offset and the other output corresponding to the bits associated with the page identifier. The register 310 can also include a single output, in which case the reading component is responsible for separating the page identifier and page offset values. Other implementation variations are possible. For example, the register 310 can instead be two individual registers, or the register 310 can be excluded, thus writing the shifted value to the input of any connected component. The actual bit width of the page identifier output and page offset output would be implemented to allow for the maximum number of bits to be sent based on the smallest and largest page size. For example, if the smallest page size was 4 KB, the page identifier output would be 52 bits wide in a 64-bit architecture. If the largest page size was 64 KB, the page offset output would be 16 bits wide.

FIG. 4 depicts a flowchart of example operations for implementing a transformation unit using a barrel shifter.

At block 400, the transformation unit receives an indication that a virtual address needs transformation. Depending on the implementation, it can be possible that multiple virtual addresses need transformation at the same time or the transformation process takes multiple processor cycles. A queue can be implemented to hold multiple virtual addresses that need to be transformed. Regardless of the specific implementation, the storage mechanism for virtual addresses that need to be transformed can notify the transformation unit when a virtual address is stored therein. Alternatively, the transformation unit can set a bit in the storage mechanism, indicating that the currently stored address has been read. When a new virtual address is written to the storage mechanism, the indicator bit is flipped. The transformation unit can check the indicator bit periodically, such as each cycle the transformation unit is not operating on a virtual address. After receiving the indication that the virtual address needs transformation, control then flows to block 404.

At block 404, the page size is read from the configuration data. As described above, the configuration data can be stored in a variety of ways, so the implementation can vary accordingly. For example, in some embodiments, the read may be a read of a register that stores the configuration data or a partial read of a register that contains the data for multiple configuration values. Additionally, the page size may not be actively read, but may be written anytime the configuration changes. After reading the page size from the configuration data, control then flows to block 406.

At block 406, the page size specific shift amount is determined. The page size specific shift amount is the shift amount that is added to the base shift amount to determine the total shift amount. As described above, this can be accomplished in a variety of ways. For example, a mapping table can be utilized if there is little relationship between the configuration value representing the page size and the shift amount. If there is a relationship between the configuration value representing the page size and the shift amount, the page size specific shift amount can be calculated based on the configuration value by utilizing the relationship. After the page size specific shift amount is determined, control then flows to block 408.

At block 408, the page size specific shift amount is added to the base shift amount to calculate the total shift amount. The base shift amount can be the shift amount for the smallest page size, or can be adjusted based on the configured page size, as described above. After calculating the total shift amount, control then flows to block 410.

At block 410, the virtual address is shifted by the total shift amount and the shifted value is stored in a register. Because a barrel shifter shifts the input value in constant time, the transformation unit can prevent the value from reaching the input of the barrel shifter to prevent the overwriting of a value in the register. The register can also include a write input, which is activated to write a value to the register. Instead of controlling the input into the barrel shifter, the transformation unit can then control whether the register is writable or not to prevent the overwriting of a value stored in the register. As described above, the shifting of the virtual address can be implemented to preserve the sign of the original virtual address, or can merely pad zeros as appropriate. After the virtual address is shifted by the total shift amount and the shifted value is stored in the register, control then flows to block 412.
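The difference between a sign-preserving shift and a zero-padding shift can be illustrated with the following sketch. It assumes a right shift and a 64-bit virtual address; both are assumptions for the example, and the function names are hypothetical.

```python
WIDTH = 64  # assumed virtual address width in bits
MASK = (1 << WIDTH) - 1


def shift_zero_pad(virtual_address, total_shift):
    # Logical right shift: vacated high order bits are filled with zeros.
    return (virtual_address & MASK) >> total_shift


def shift_sign_preserving(virtual_address, total_shift):
    # Arithmetic right shift: vacated high order bits copy the original
    # high order (sign) bit.
    value = virtual_address & MASK
    if value >> (WIDTH - 1):
        value -= 1 << WIDTH  # reinterpret as a negative two's complement value
    return (value >> total_shift) & MASK
```

For an address whose high order bit is zero, the two functions return the same result; they differ only when the high order bit is one.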

At block 412, the transformation unit notifies the page table interface that a page identifier and page offset are available for translation into a physical address. The technique used to notify the page table interface can vary between implementations. For example, the transformation unit can write the page identifier and page offset to specified registers. Or, the transformation unit can set a specific bit to a defined value, indicating that the page table interface can read the value from the transformation unit. After notifying the page table interface that the page identifier and page offset are available for translation, the process ends.

Memory can also be divided into segments. A segment is a block of memory comprising a set of pages. Thus, each segment has its own page table. Different segments can have different page sizes. The configuration data can include a value pointing to the base memory address for the page table associated with the currently active segment of memory. The configuration data also includes the page size specific to the currently active segment of memory, in accordance with the descriptions above.

The page sizes described above may include minimum page sizes for a particular page table or set of page tables. For example, if the minimum configured page size is 4 KB, the memory architecture can be implemented to allow for page sizes that are combinations of 4 KB pages. For example, tree-based page tables, such as a radix page table, can support multiple page sizes in the same page table. A particular branch with larger page sizes uses fewer levels than branches with smaller page sizes. Metadata associated with the page table entry can indicate whether the physical address associated with the page table entry is a pointer to another level of the page table or the physical address of the actual data referenced by the virtual address.
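The way a tree-based page table can hold multiple page sizes can be sketched as follows. The dictionary layout, the `leaf` metadata flag, the nine index bits per level, and the `walk` function are all hypothetical choices for this illustration and are not taken from the description.

```python
BITS_PER_LEVEL = 9  # assumed number of index bits consumed per level


def walk(root, page_number, levels):
    """Walk a hypothetical radix table; return (physical_address, levels_used)."""
    node = root
    for level in range(levels):
        # Select the index bits for this level, highest level first.
        shift = BITS_PER_LEVEL * (levels - 1 - level)
        index = (page_number >> shift) & ((1 << BITS_PER_LEVEL) - 1)
        entry = node[index]
        if entry["leaf"]:
            # Metadata marks this entry as the physical address of the data.
            return entry["addr"], level + 1
        # Otherwise the entry is a pointer to the next level of the table.
        node = entry["next"]
    raise LookupError("no leaf entry found")


# A two-level table: index 0 points to a second level (smaller pages),
# while index 1 is itself a leaf (a larger page that ends the walk early).
example_root = {
    0: {"leaf": False, "next": {5: {"leaf": True, "addr": 0x40000}}},
    1: {"leaf": True, "addr": 0x200000},
}
```

In this sketch, the branch holding the larger page terminates after one level, whereas the branch holding the smaller page requires two.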

In some embodiments, the transformation unit reads the current page size from a storage description register. The storage description register can also include other metadata, such as the physical address of the current page table and the size of the current segment. The storage description register is updated to modify the page size, segment size and physical address of the page table as appropriate. For example, the physical address of a page table associated with a first process will generally differ from the physical address of a page table associated with a second process. Thus, when a context switch is performed, the physical address of the page table is generally updated.

The transformation unit receives a virtual address in the form of a segment page number with a page offset (also known as a “byte offset”) concatenated to it. The segment page number is the page identifier for the particular memory segment that is active. The transformation unit determines the size (number of bits) of the page offset based on the current page size stored in the storage description register. The size of the page offset can also be determined based on the segment size. The transformation unit then utilizes a shift register to perform a right bit shift, right shifting the virtual address a number of bits equal to the size of the page offset. The result of the bit shift is a normalized page number and the page offset as two separate values. The normalized page number is then provided to a radix table walker, which utilizes the normalized page number to walk a radix page table. The radix table walker finds the associated physical address for the physical page the data is located in. The page offset is then concatenated to the physical page address by the table walker or another component to generate a physical address associated with the data referenced by the virtual address.
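The separation of the virtual address into a normalized page number and a page offset, and the later recombination with the physical page address, can be sketched as follows. The function names are hypothetical, the page size is assumed to be a power of two, and the physical page address is assumed to be aligned to the page size.

```python
def offset_bits_for(page_size):
    # For a power-of-two page size, the offset width is log2(page_size).
    return page_size.bit_length() - 1


def split_virtual_address(virtual_address, offset_bits):
    # The right shift yields the normalized page number; the bits shifted
    # out form the page offset.
    page_number = virtual_address >> offset_bits
    page_offset = virtual_address & ((1 << offset_bits) - 1)
    return page_number, page_offset


def to_physical(physical_page_address, page_offset):
    # With a page-aligned physical page address, a bitwise OR is
    # equivalent to concatenating the page offset.
    return physical_page_address | page_offset
```

For example, with a 4 KB page size the offset is 12 bits wide, so the virtual address 0x12345 separates into page number 0x12 and page offset 0x345.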

Although the embodiments described above describe using a shift register, the inventive subject matter is not so limited. The actual implementation of a virtual address transformation unit can vary based on the implementation of the page table. For example, the transformation unit can be implemented to isolate the individual sub-identifiers for multilevel page tables instead of only isolating the entire set. The transformation unit can also be implemented to perform different operations on the virtual address based on metadata bits in the virtual address itself or the related configuration data. For example, the high order bit of a virtual address can be defined to indicate whether a virtual address includes metadata bits or not. In other words, if the virtual address high order bit is set to one, the virtual address contains ten bits of metadata, whereas if the high order bit is set to zero, the virtual address contains zero bits of metadata. The transformation unit can be implemented such that if the high order bit is set to one, the metadata bits are not included in the page identifier.
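The metadata example can be sketched as follows. The description does not state where the ten metadata bits are located or whether the high order bit is counted among them; this sketch assumes a 64-bit virtual address in which the ten metadata bits sit directly below the high order bit, and it always excludes the high order bit itself from the page identifier. The function name is hypothetical.

```python
WIDTH = 64          # assumed virtual address width in bits
METADATA_BITS = 10  # from the example above


def page_identifier(virtual_address, offset_bits):
    # The high order bit indicates whether metadata bits are present.
    has_metadata = (virtual_address >> (WIDTH - 1)) & 1
    # Assumption: metadata bits, when present, sit directly below the
    # high order bit; the high order bit itself is always excluded.
    usable_bits = WIDTH - 1 - (METADATA_BITS if has_metadata else 0)
    address_bits = virtual_address & ((1 << usable_bits) - 1)
    # Drop the page offset to leave only the page identifier.
    return address_bits >> offset_bits
```

When the high order bit is zero, the same ten bit positions are treated as part of the page identifier rather than as metadata.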

As example flowcharts, the flowcharts depicted above present operations in an example order from which embodiments can deviate (e.g., operations can be performed in a different order than illustrated and/or in parallel). For example, instead of reading the page size configuration data after writing the virtual address into the shift register as depicted in FIG. 4 at blocks 404 and 402, respectively, the transformation unit can read the page size configuration data first.

As will be appreciated by one skilled in the art, aspects of the present inventive subject matter may be embodied as a system, method or computer program product. Accordingly, aspects of the present inventive subject matter may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present inventive subject matter may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present inventive subject matter may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present inventive subject matter are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the inventive subject matter. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

FIG. 5 depicts an example computer system including a virtual address transformation unit. A computer system includes a processor unit 501 (possibly including multiple processors, multiple cores, multiple nodes, and/or implementing multi-threading, etc.). The computer system includes memory 507. The memory 507 may be system memory (e.g., one or more of cache, SRAM, DRAM, zero capacitor RAM, Twin Transistor RAM, eDRAM, EDO RAM, DDR RAM, EEPROM, NRAM, RRAM, SONOS, PRAM, etc.) or any one or more of the above already described possible realizations of machine-readable media. The computer system also includes a bus 503 (e.g., PCI, ISA, PCI-Express, HyperTransport®, InfiniBand®, NuBus, etc.), a network interface 505 (e.g., an ATM interface, an Ethernet interface, a Frame Relay interface, SONET interface, wireless interface, etc.), and a storage device(s) 509 (e.g., optical storage, magnetic storage, etc.). The computer system also includes a virtual address transformation unit 511. The virtual address transformation unit 511 embodies functionality to implement embodiments described above. The virtual address transformation unit 511 may include one or more functionalities that facilitate the transformation of virtual addresses based on dynamically configurable page sizes. Any one of these functionalities may be partially (or entirely) implemented in hardware and/or on the processing unit 501. For example, the functionality may be implemented with an application specific integrated circuit, in logic implemented in the processing unit 501, in a co-processor on a peripheral device or card, etc. Further, realizations may include fewer or additional components not illustrated in FIG. 5 (e.g., video cards, audio cards, additional network interfaces, peripheral devices, etc.). The processor unit 501, the storage device(s) 509, and the network interface 505 are coupled to the bus 503. Although illustrated as being coupled to the bus 503, the memory 507 may be coupled to the processor unit 501.

While the embodiments are described with reference to various implementations and exploitations, it will be understood that these embodiments are illustrative and that the scope of the inventive subject matter is not limited to them. In general, techniques for implementing virtual memory as described herein may be implemented with facilities consistent with any hardware system or hardware systems. Many variations, modifications, additions, and improvements are possible.

Plural instances may be provided for components, operations or structures described herein as a single instance. Finally, boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the inventive subject matter. In general, structures and functionality presented as separate components in the example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the inventive subject matter.

What is claimed is:
 1. A method comprising: receiving an indication of a virtual address; reading, from a register, a current page size of a plurality of available page sizes; determining a shift amount based, at least in part, on the current page size; and performing a bit shift of the virtual address to create a transformed virtual address, wherein the virtual address is bit shifted by at least the determined shift amount.
 2. The method of claim 1 further comprising utilizing the transformed virtual address as input to a page table walk technique, wherein the page table walk technique is not dependent on the current page size.
 3. The method of claim 1, wherein determining the shift amount based, at least in part, on the current page size comprises determining the shift amount by at least one of: performing a table lookup, wherein the table maps a value indicating the current page size to a shift amount, calculating the shift amount based on at least one of the value indicating the current page size and a base shift amount, and determining the number of bits set to one in the value indicating the current page size.
 4. The method of claim 1 further comprising using the transformed virtual address as at least one of a set of one or more page identifiers and a page offset.
 5. The method of claim 4 further comprising: calculating a padding amount for at least one of a page identifier of the set of one or more page identifiers and the page offset; and padding at least one of the page identifier of the set of one or more page identifiers and the page offset in accordance with the respective calculated padding amount.
 6. The method of claim 1 further comprising using the transformed virtual address to perform a page table lookup.
 7. The method of claim 1, wherein the virtual address is transformed based, at least in part, on a template, wherein the virtual address transformed in accordance with the template is compatible with a plurality of page table types.
 8. The method of claim 1 further comprising selecting a page table walk technique from a plurality of page table walk techniques, wherein the selection of the page table walk technique is independent of the current page size. 